2 research outputs found

    Performance/cost analysis of a cloud based solution for big data analytic: Application in intrusion detection

    The essential aim of ‘Big Data’ technology is to provide new techniques and tools for assimilating and storing the large amounts of data being generated, so that this data can be analyzed and processed to yield insights and predictions that open new opportunities for improving our lives in different domains. In this context, ‘Big Data’ raises two essential issues: real-time analysis, driven by the increasing velocity at which data is generated, and long-term analysis, driven by the huge volume of stored data. To deal with these two issues, we propose in this paper a cloud-based solution for big data analytics on the Amazon Cloud operator. Our objective is to evaluate the performance of the Big Data services offered with respect to the volume and velocity of the processed data. The dataset we use contains information about “network connections” in approximately 5 million records with 41 features; the solution works as a network intrusion detector. It receives data records in real time from a Raspberry Pi node and predicts whether each connection is bad (malicious intrusion or attack) or good (normal connection). The prediction model was built using a logistic regression network. We evaluate the cloud resources needed to train the machine learning model (batch processing) and to classify new streaming data with the trained network in real time (real-time processing). The solution worked very well, with high accuracy, and the results show that working with Big Data in the cloud is mainly a matter of a cost/performance trade-off: processing performance, in terms of response time for both long-term and real-time analysis, can always be guaranteed once the cloud resources are well provisioned according to the needs.
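The batch-training and per-record prediction steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data is synthetic, it uses 2 features rather than the 41 mentioned, and the names `train`, `predict`, and `make_synthetic` are hypothetical helpers chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    # Probability that record x is a "bad" connection.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train(X, y, lr=0.1, epochs=200):
    # Batch phase: fit weights by stochastic gradient descent on log loss.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = score(w, b, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    # Real-time phase: classify a single incoming record.
    return "bad" if score(w, b, x) >= 0.5 else "good"


def make_synthetic(n, seed=0):
    # ASSUMPTION: toy stand-in for the connection records, 2 features only.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = bad, 0 = good
        center = 2.0 if label else -2.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_synthetic(200)
w, b = train(X, y)
accuracy = sum((predict(w, b, x) == "bad") == bool(t) for x, t in zip(X, y)) / len(X)
```

In this sketch, `train` corresponds to the long-term batch step and `predict` to the per-record streaming step; in a cloud deployment the two would typically run on separately provisioned resources, which is where the cost/performance trade-off discussed in the abstract arises.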

    An IoT-Cloud Based Solution for Real-Time and Batch Processing of Big Data: Application in Healthcare

    With the widespread use of the Internet of Things (IoT) today, everything around us seems to generate data. The ever-increasing number of connected things or objects is coupled with a growing volume of data generated at a continually increasing rate. Especially where data is big or needs to be processed, cloud infrastructures, with their scalability and easy access, are becoming the solution of choice for storage and processing. In the context of healthcare applications, where medical sensors collect health data from patients and send it to the cloud, two issues frequently appear in relation to 'Big Data'. The first issue is real-time analysis, introduced by the increasing velocity at which data is generated, especially from connected devices (IoT). This data should be analyzed continuously in real time in order to take appropriate actions regarding the patient's care plan. Moreover, medical data accumulated from different patients over time constitutes an important training dataset that can be used to train machine learning models in order to perform smarter disease prediction and treatment. This gives rise to a second issue: long-term batch processing of often huge volumes of stored data. To deal with these issues, we propose an IoT-Cloud based framework for real-time and batch processing of Big Data in the healthcare domain. We implement the proposed solution on the Amazon Cloud operator, known as Amazon Web Services (AWS), and use a Raspberry Pi as an IoT device to generate data in real time. We test the solution with the specific application of ECG monitoring and abnormality reporting. We analyze the performance of the implemented system in terms of response time by varying the velocity and volume of the analyzed data. We also discuss how the cloud resources should be provisioned in order to guarantee processing performance for both long-term and real-time scenarios. To ensure a good trade-off between cost and processing performance, resource provisioning should be adapted to the exact needs and characteristics of the considered application.